Herme the Collection of Natural Speech Data

Author

  • Nick Campbell
Abstract

This paper describes our approach to the collection of ‘natural’ (i.e., representative) data from spoken interactions in a social setting, in the context of the development (through time) of expressive speech synthesis. Over the past ten years or so, we have collected several corpora of unprompted social conversations that illustrate the ‘contact’ element of speech that was lacking in many of the corpora collected by use of a specific ‘task’ with paid participants. The paper discusses the technical and ethical issues of collecting such spoken material, and highlights some of the problems we have encountered in the processing of this much-needed data. Through the use of attractive conversational devices, we have found that natural human curiosity and an element of social programming combine to provide us with a rich source of material that complements the task-based collections from paid informants.

Similar resources

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Abstract: Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Gadamer’s Ambivalence toward the Enlightenment Project

This essay explores Gadamer’s ambivalent relationship with modernity.  Gadamer is a prominent critic of the Enlightenment project.  His criticisms are both theoretical and practical.  Theoretically, representationalism is at the center of modern epistemology for Gadamer.  Practically, Gadamer sees the demotion of prudence (phronesis) as fundamental to the “bad” Enlightenment.  Gadamer’s attempt...

Multi-Site Data Collection for a Spoken Language Corpus

This paper describes a recently collected spoken language corpus for the ATIS (Air Travel Information System) domain. This data collection effort has been co-ordinated by MADCOW (Multi-site ATIS Data COllection Working group). We summarize the motivation for this effort, the goals, the implementation of a multi-site data collection paradigm, and the accomplishments of MADCOW in monitoring the c...

A Contrastive Study of Request Speech Act in English and Persian Novels: Natural Semantic Metalanguage Approach

The Natural Semantic Metalanguage (NSM) Approach claims that there are some universalities in all languages. Speech acts seem to be present in all languages, but considering this approach, research has not indicated whether request speech act differs from one language to another. Thus, this study intended to investigate whether request strategies are used differently in English and Persian roma...

Publication date: 2016